Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
5472 | 330 | 1,5 |
7108 | 242 | 2,5 |
8728 | 188 | 3,5 |
8811 | 186 | iii,v |
9579 | 168 | i,ii,iii,iv,v |
11346 | 133 | 1,2 |
11409 | 132 | 4,5 |
12564 | 116 | 0,5 |
14134 | 99 | 1,3 |
15979 | 84 | 6,5 |

Words containing ( or ):

Rank in Wordlist | Frequency | Word |
---|---|---|
53684 | 12 | .) |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
2652 | 745 | 100% |
4513 | 412 | 50% |
6350 | 277 | 10% |
7169 | 239 | 20% |
7228 | 237 | 30% |
8061 | 207 | 80% |
8798 | 186 | 40% |
10893 | 141 | 15% |
11000 | 139 | 5% |
11594 | 129 | 70% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
12735 | 114 | NCT&I |
14846 | 93 | R&D |
36918 | 23 | R&B |
45681 | 16 | H&M |
57246 | 11 | P&O |
68105 | 8 | Fill&Serve |
72792 | 7 | AIR&RAIL |
72965 | 7 | B&B |
73935 | 7 | Johnston&Cie |
74474 | 7 | Night&Day |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
164753 | 2 | US$/t |
193852 | 1 | 12$/lb |
199977 | 1 | 30.000$/T |
204849 | 1 | 9/10.000$/T |
281405 | 1 | U$/ha/an |
281632 | 1 | US$/XPF |
281633 | 1 | US$/livre |
313242 | 1 | e$4e8Ty |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
1442 | 1349 | ." |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
76 | 16275 | d'un |
83 | 14709 | d'une |
206 | 6935 | c'est |
209 | 6818 | n'est |
249 | 5658 | C'est |
272 | 5363 | qu'il |
359 | 4231 | n'a |
578 | 3006 | s'est |
646 | 2709 | jusqu'à |
873 | 2114 | qu'elle |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
20830 | 57 | CANAL+HD |
25445 | 42 | R+1 |
33402 | 27 | Bac+2 |
37897 | 22 | N+1 |
41094 | 19 | BAC+2 |
41397 | 19 | R+2 |
49785 | 14 | R+3 |
50061 | 14 | bac+5 |
51432 | 13 | Bac+5 |
53934 | 12 | Cine+Premier |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
1314 | 1470 | et/ou |
2951 | 666 | km/h |
5207 | 349 | Risqueshttps://securite-civile |
7428 | 230 | https://www |
10077 | 156 | 1/2 |
11657 | 128 | 24h/24 |
12481 | 117 | F/l |
14958 | 92 | Nouvelle-Calédoniehttp://webmariotti |
15434 | 88 | 2/3 |
16117 | 83 | 7j/7 |
In the last subsection of this type we look for words containing other special characters: , ( ) % & $ " ' + * = / _
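Depending on the language, some of these characters may be allowed within words while others are not. If words containing forbidden characters do not have a very low frequency, there may be a problem in preprocessing. For example, a fused token such as Risqueshttps://securite-civile with a frequency of 349 probably points to a tokenization problem rather than to a genuine word.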
Example query, here for words containing %:
select w_id-100,freq, word from words where w_id>100 and word like "%\%%" limit 10;
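The check described above can be repeated for every character in the list and combined with a frequency threshold. The following is only a minimal sketch, not the tooling used for this report: it assumes a `words` table with the columns `w_id`, `word` and `freq` (inferred from the query above), keeps the same offset of 100, and runs against an in-memory SQLite database filled with a few rows copied from the tables in this section. The helper names and the threshold `min_freq=100` are hypothetical, and which characters count as forbidden depends on the language.

```python
import sqlite3

# Special characters listed in the text above.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def top_words_with_char(conn, char, limit=10):
    """Return (rank, freq, word) rows for the highest-ranked words containing char."""
    # instr() is a literal substring test, so % and _ need no LIKE escaping.
    query = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND instr(word, ?) > 0 "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(query, (char, limit)).fetchall()


def suspicious_chars(conn, forbidden, min_freq=100):
    """Map each forbidden character to its words with frequency >= min_freq.

    min_freq is a hypothetical threshold for "not very low frequency".
    """
    flagged = {}
    for char in forbidden:
        rows = [row for row in top_words_with_char(conn, char) if row[1] >= min_freq]
        if rows:
            flagged[char] = rows
    return flagged


# Toy data copied from the tables in this section (w_id = rank + 100).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        (2752, "100%", 745),
        (5307, "Risqueshttps://securite-civile", 349),
        (14946, "R&D", 93),
        (53784, ".)", 12),
    ],
)

# Assumption for this example only: "/" and ")" are treated as forbidden.
print(suspicious_chars(conn, forbidden=["/", ")"]))
```

With this toy data only the fused URL token is reported for "/", while ".)" stays below the threshold. Using `instr()` instead of `LIKE` is a design choice of this sketch: it avoids having to escape the wildcard characters % and _, which are themselves in the list.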
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots